Summary


This  project  was  initiated  to  test  the  feasibility  of  designing  a 
close-coupled,  two-way  communication  link  between  man  and  computer  using 
biological  information.  Specifically,  experiments  have  been  devised  to 
determine  whether  biological  information  can  be  related  to  human  thought, 
whether  this  information  can  be  processed  meaningfully  by  a computer,  and 
whether  similar  biological  processes  representing  the  same  or  other  thoughts 
can be induced in the same or another individual. Should such a close
coupling between man and machine prove to be feasible, possible applications
would include extremely rapid interactive processing between a man and a
computer, or communication between two or more persons with the computer
acting as an interface.

The  research  plan  was  predicated  on  existing  evidence  that  verbal 
ideas  or  thinking  are  subvocally  represented  in  the  facial  muscles  of  the 
vocal  apparatus  (see  Rationale  of  Approach,  p.  3,  for  details).  If  the 
patterns  of  this  muscle  activity  are  at  all  similar  to  those  involved  in 
normal  overt  speech,  then  it  is  reasonable  to  assume  that  the  electrical 
activity of the brain during covert speech (verbal thinking) may be similar
to that during overt speech. The objective of the first year of research
is to establish the validity of this basic premise.

The general methodology was to record the electromyograph (EMG) of
facial muscles involved in speech from volunteer human subjects during
performance of language tasks. The electroencephalograph (EEG) from
scalp electrodes overlying areas of the cerebral cortex involved in speech
was recorded simultaneously (see Methods, p. 5, for details). The resulting
analog  data  were  then  digitized  for  computer  processing,  and  several 
statistics  that  reveal  patterns  of  cortical  activity  were  calculated. 

These  statistics  were  then  used  in  a computer  pattern  recognition  program 
designed  to  identify  features  in  the  physiological  data  associated  with 
specific  words,  whether  overtly  or  covertly  produced. 


Two experimental paradigms were used (see Results: Experiment 1,
p. 13, and Results: Experiment 2, p. 29, for details). In the first, EMG
and EEG records were obtained during performance of a language task under
various conditions of stimulus presentation, including: visual presentation,
overt response; visual presentation, covert response (silent reading);
auditory presentation, eyes open, overt response; and auditory presentation,
eyes closed, overt response. (The last two conditions were chosen because
the EEG is characteristically different when the eyes are open compared
with closed.) The language task, recommended by a psychophysiological
linguist consultant, consisted of words and sentences most likely to reveal
patterns in the EMG and EEG that may be related to speech and verbal
thinking. In the second experiment, similar records were obtained, but
under slightly different stimulus conditions and with 20 repetitions per
subject for reliability tests. These conditions included visual presentation
with overt response of five selected monosyllabic words, and five
bisyllabic words with the accent first on one syllable and then on the
other to compare ordering effects caused by emphasis.


Computer processing of these data has not been completed, but significant
results to date are:

(1)  EMG  patterns  for  each  word  are  specific  for  that  word. 

(2)  EMG  patterns  for  a given  word  are  consistent,  showing  less 
within  subject  variability  than  between  subject  variability. 

(3) Averaged EMG patterns for a given word spoken by a given individual
are sufficiently consistent for that averaged EMG to
serve as a template for identifying the same word when it is
embedded in a sentence.

(4) There is some variability in EMG patterns for bisyllabic words
between accent on the first or second syllable, but it is
sufficiently small so that either pattern may be used to identify
the same unaccented word embedded in a sentence.

(5)  "Raw"  EEG  patterns  for  silently  read  words  are  similar  to  those 
for  the  same  overtly  read  words,  so  that  the  onset  and  ending 
of  silent  reading  may  be  identified  by  visual  observation. 

This  strongly  suggests  that  the  statistical  data  of  the  EEG 
during verbal thinking should be similar to the statistical
data  of  the  EEG  during  vocalization  of  the  same  thought. 

(6)  The  pattern  recognition  analysis  so  far  carried  out  has  been 
able  to  distinguish  words  beginning  with  "h"  from  all  other 
words,  and  to  define  five  clusters  for  five  word  sets. 

In conclusion, the results so far show that the EMG may be used to
identify  specific  overtly  spoken  words  of  a given  individual.  The  results 
also  imply  that  patterns  of  EEG  activity  associated  with  these  words  may 
be  used  to  identify  the  same  word  when  covertly  produced  (as  in  verbal 
thinking).  Completion  of  the  EEG  computer  pattern  recognition  analysis 
during  the  remainder  of  this  contract  year  should  definitely  establish 
the  validity  or  nonvalidity  of  this  implication.  Research  during  the 
second  year  will  examine  all  possible  EEG  locations  to  identify  those 
that  best  serve  to  make  such  identifications  on-line. 

No  important  items  of  equipment  were  purchased  or  developed  during 
this  period,  and  there  were  no  major  technical  problems. 

Introduction and Significance of Research


This  project  was  initiated  to  test  the  feasibility  of  designing  a 
close-coupled,  two-way  communication  link  between  man  and  computer  using 
biological  information.  Specifically,  experiments  have  been  devised  to 
determine  whether  this  information  can  be  processed  meaningfully  by  a 
computer,  and  whether  similar  biological  processes  representing  the  same 
or other thoughts can be induced in the same or another individual.

Should such a close coupling between man and machine prove to be
feasible, possible applications would include extremely rapid interactive
processing between a man and a computer, or communication between two
or more persons with the computer acting as an interface. For example,
an individual using such a biocybernetic communication system would be
able to "talk" (i.e., both send and receive) with a computer at the speed
of thought, rather than be limited by the speed of a teletype or other
electromechanical device through which ideas in the form of questions and
answers must normally pass. In addition, nonverbal imagery and affective
(emotional or "feeling") states might similarly be used in the communication
process, thereby significantly increasing the bandwidth of information
transfer. Furthermore, two or more individuals, separated by short or
long distances, would have the capability of rapid and accurate communication
with a high degree of immunity to decoding if the signals were
intercepted, and information transfer might be more complete than with
normal speech.

Rationale  of  Approach 

Our  approach  is  predicated  on  previous  research  conducted  by  the 
authors and others in the areas of psychophysiological measures of thought,
computer processing of electrophysiological information, and development
of  computer  pattern  recognition  techniques.  This  research  may  be  summarized 
as  follows  (see  SRI  Proposal  LSU  71-145  to  DARPA,  dated  10  December  1971, 
for  details). 

Early  work  by  Watson  (1930)  indicated  that  verbal  cognitive  processes 
may  be  represented  in  muscle  activity  of  the  vocal  apparatus  as  subvocal 
speech.  McGuigan  (1970),  reviewing  studies  of  such  covert  oral  behavior 
during  the  silent  performance  of  a language  task,  concludes  that  covert 
oral behavior (as measured by the electromyograph, or EMG) increases
significantly in amount and frequency of occurrence. Thus, verbal ideas
or thinking, although unquestionably a central nervous system process
(MacNeilage and MacNeilage, 1971), has some sort of peripheral representation
in the muscles of the vocal apparatus.

If  the  patterns  of  this  muscle  activity  are  at  all  similar  to  those 
involved in normal overt speech, then it is reasonable to assume that the
electrical activity of the brain during covert speech, or thinking, may
be similar to that during overt speech. That is, a measure of the
scalp-recorded electroencephalograph (EEG) of a human during verbal thinking
should be similar to the EEG of the same individual when expressing the
same thoughts vocally.

However, examination of the "raw" EEG has not revealed any obvious
pattern  related  to  overt  or  covert  speech;  it  may  be  that  only  patterns 
of  EEG  activity  between  various  areas  of  the  brain  at  a given  moment  are 
related  to  speech.  Several  technical  advances  made  in  recent  years  have 
provided  us  with  some  tools  to  deal  with  this  possibility.  Most  important 
is  the  use  of  computer  techniques  for  frequency  analysis  of  the  real-time 
EEG  and  the  development  of  multivariate  statistical  procedures  (Donchin  and 
Lindsley, 1966; John et al., 1964; and Rose and Lindsley, 1965). These
procedures allow comparison of specific components of EEG waveforms that
are known to reflect different neurophysiological processes. In addition,
certain statistics, such as auto- and cross-spectral frequency analysis
(Walter, 1963; Walter and Adey, 1965), linear coherence function (Adey,
Kado and Walter, 1967), and the weighted-average coherence (Galbraith, 1967),
may  be  used  to  determine  the  degree  of  interaction  between  two  different 
brain  regions.  Thus,  with  these  tools,  the  EEG  waveforms  from  several 
areas  of  the  brain  that  are  neurophysiologically  related  to  speech  may 
be  examined  to  determine  if  their  patterns  or  interaction  are  similar  during 
overt  speech  and  verbal  thinking. 
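
The idea of coherence as a measure of interaction between two recording
sites can be illustrated with a short sketch. The Python code below is not
the analysis program used in this study; it assumes a naive discrete Fourier
transform, non-overlapping segments, synthetic test signals, and hypothetical
function names, and it estimates magnitude-squared coherence by averaging
auto- and cross-spectra over segments.

```python
import cmath
import math
import random


def dft(segment):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(segment)
    return [
        sum(segment[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def coherence(x, y, seg_len):
    """Magnitude-squared coherence of x and y, averaged over segments.

    Returns one value per frequency bin, each between 0 and 1.
    """
    n_seg = min(len(x), len(y)) // seg_len
    n_bins = seg_len // 2 + 1
    sxx = [0.0] * n_bins   # auto-spectrum of x
    syy = [0.0] * n_bins   # auto-spectrum of y
    sxy = [0j] * n_bins    # cross-spectrum of x and y
    for s in range(n_seg):
        fx = dft(x[s * seg_len:(s + 1) * seg_len])
        fy = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    return [
        abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) if sxx[k] > 0 and syy[k] > 0 else 0.0
        for k in range(n_bins)
    ]


# Synthetic data only: channels a and b share a rhythm that falls in
# frequency bin 10 (10 cycles per segment) plus independent noise;
# channel c is pure noise.
random.seed(0)
n, seg_len = 512, 64
shared = [math.sin(2 * math.pi * 10 * t / seg_len) for t in range(n)]
chan_a = [s + 0.5 * random.gauss(0, 1) for s in shared]
chan_b = [s + 0.5 * random.gauss(0, 1) for s in shared]
chan_c = [random.gauss(0, 1) for _ in range(n)]

coh_ab = coherence(chan_a, chan_b, seg_len)
coh_ac = coherence(chan_a, chan_c, seg_len)
coh_aa = coherence(chan_a, chan_a, seg_len)
```

In this sketch the coherence of a channel with itself is one in every bin,
and the shared rhythm should give channels a and b a higher coherence in
bin 10 than channels a and c.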

A thorough  visual  analysis  of  the  statistical  results  of  these  EEG 
waveforms would be extremely complicated and time-consuming; certainly
on-line visual analysis of verbal thinking would not be possible. Therefore,
we have turned to machine pattern recognition techniques to analyze
the  patterns  of  the  EEG  interrelationships  to  be  found  in  the  cross-spectra 
and  coherence  functions  related  to  covert  and  overt  speech.  Most  useful 
for  this  feasibility  study  are  techniques  for  on-line  pattern  recognition 
using interactive graphic displays (Hall et al., 1968). These techniques
allow the user to process multivariate data by using all reasonably conceivable
graphic plots, and further manipulate the data using appropriate
numeric  procedures  available  in  the  computer  system.  Thus,  for  our  purposes, 
a set  of  statistics  such  as  the  coherence  functions  of  the  EEG,  the  patterns 
of  the  EMG  changes  with  overt  speech,  and  other  measures  may  be  plotted 
as  a function  of  each  other  for  specific  covert  language  tasks  (i.e.,  thinking). 

The objective of the first year of this feasibility study, then, is
to establish the validity of the basic premise that patterns of biological
information can be related to covert language behavior. This is being
done  by: 

(1) Measurement of EMGs of the vocal apparatus and EEGs overlying
cerebral areas involved in speech during overt and covert
language tasks.

(2) Computer processing of the averaged biological activity, and
analysis of the cross- and auto-spectra, coherence, and weighted-average
coherence of the EEG as related to EMG speech patterns.

(3) Application of computer pattern recognition techniques to determine
if the statistical patterns of biological activity from the
EMGs and EEGs are similar during overt and covert speech, and
to attempt to machine-identify silent language performance with
the pattern recognition method.




Methods 


General


Data  Collection.  The  data  have  been  collected  and  analyzed  as  follows. 
During performance of a language task, integrated EMGs from surface-recordable
muscles of the human face involved in speech production are recorded along
with the instantaneous EEG from places on the scalp overlying brain regions
involved in speech. Integrated EMGs, with a time constant of 0.25 sec,
were  chosen  over  instantaneous  EMGs  because  of  the  limited  band  pass  of  the 
recording  system  (a  Beckman  type  R Dynograph),  because  of  the  relative 
ease  of  quantification  of  EMG  activity,  and  because  integrated  patterns 
of  EMG  changes  related  to  speech  are  more  readily  identifiable  visually 
and  by  machine. 
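
One way to picture EMG integration with a 0.25-sec time constant is as
full-wave rectification followed by a first-order (leaky) integrator. The
Python sketch below rests on that assumption; the actual circuit of the
integrator coupler is not described here, and the sampling rate, the test
signal, and the function name are hypothetical.

```python
import math


def integrate_emg(samples, sample_rate_hz, time_constant_s=0.25):
    """Full-wave rectify, then smooth with a first-order leaky integrator."""
    # Per-sample smoothing factor corresponding to the time constant.
    alpha = 1.0 - math.exp(-1.0 / (sample_rate_hz * time_constant_s))
    level = 0.0
    envelope = []
    for v in samples:
        level += alpha * (abs(v) - level)
        envelope.append(level)
    return envelope


# Synthetic input: 1 sec of silence, then a 1-sec burst of 100-Hz activity.
rate = 1000  # samples per second (assumed for illustration)
raw = [0.0] * rate + [math.sin(2 * math.pi * 100 * t / rate) for t in range(rate)]
env = integrate_emg(raw, rate)
```

The envelope stays at zero during silence, rises to roughly 63 percent of
its final level one time constant after burst onset, and changes far more
slowly than the raw waveform, which is what makes the integrated trace
easier to compare visually.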

The  language  tasks  are  described  in  detail  below.  In  brief,  they 
consist  of  the  human  subject  being  presented  visually  or  auditorily  with 
selected  individual  words  or  sentences,  in  response  to  which  the  subject 
speaks  the  word  or  sentence  overtly  or  reads  it  covertly  as  instructed. 

The  output  of  the  Beckman  recorder  is  simultaneous  on  eight  channels 
of  ink-writing,  moving  paper  (at  25  mm/sec)  and  on  an  Ampex  SP-300,  seven 
channel analog instrument tape recorder with FM and AM capability. The
Beckman chart recording is used for real-time monitoring of signal levels
and  for  later  editing  of  the  Ampex  analog  tape.  Six  channels  of  EMG  and 
EEG  and  one  channel  of  voice  are  recorded  on  the  analog  tape. 

Data  Analysis.  Data  are  analyzed  in  three  ways.  The  first  is  direct 
overlay  tracings  of  the  EMG  outputs  on  the  Dynograph  recorder.  Overlays 
of  various  EMG  placements  are  visually  compared  for  within-word  pattern 
variability,  between  word  variability,  between  subject  variability, 
between  experimental  conditions  variability,  and  between  experimental 
sessions  for  reliability  of  patterned  EMG  production. 

Linc-8. The second and third methods of analysis are not yet
fully operational, but are proceeding as follows. In the second method,
the Ampex analog tape is edited and digitized by a Linc-8 computer; the
seven channels of edited data are then stored on Linc tape for further
processing. A data block consists of five seconds of recording: the
three seconds before the stimulus word presentation, one second for the
overt  or  covert  response,  and  one  second  of  post-response  activity.  Six 
months  have  been  spent  writing  the  Linc-8  programs  necessary  for  this 
data  digitization  and  storage  process;  the  program  is  now  complete,  and 
data  are  being  edited  and  stored. 
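
The layout of a data block can be sketched as follows. This is an
illustration only, not the Linc-8 program: the digitizing rate is not stated
in this report, so the rate used here is an assumption, and the names are
hypothetical.

```python
PRE_STIMULUS_S = 3.0    # recording before stimulus word presentation
RESPONSE_S = 1.0        # overt or covert response
POST_RESPONSE_S = 1.0   # post-response activity


def split_block(block, sample_rate_hz):
    """Split one channel of a five-second data block into its three parts."""
    n_pre = int(PRE_STIMULUS_S * sample_rate_hz)
    n_resp = int(RESPONSE_S * sample_rate_hz)
    n_post = int(POST_RESPONSE_S * sample_rate_hz)
    if len(block) != n_pre + n_resp + n_post:
        raise ValueError("block does not span five seconds at this rate")
    return {
        "pre_stimulus": block[:n_pre],
        "response": block[n_pre:n_pre + n_resp],
        "post_response": block[n_pre + n_resp:],
    }


rate = 200  # samples per second per channel (assumed for illustration)
one_channel = list(range(int(5 * rate)))  # placeholder samples
parts = split_block(one_channel, rate)
```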

Several  other  processing  procedures  also  will  be  done  on  the 
Linc-8. One procedure is to transfer the stored edited data from Linc
tape to a digital tape recorder for further processing on a CDC-6400
computer (see below). This recorder will be installed during October;
programs for the data transfer are now being written, and complete transfer
capability of edited data to the CDC-6400 should exist by December.
This will reduce our turnaround time from about one month at present to
no  more  than  one  day  (i.e.,  the  time  from  initial  data  collection  to 
complete  computer  analysis  and  some  pattern  recognition). 

Other procedures on the Linc-8 will include averaging of EMG
and EEG responses during overt and covert speech. These averaged responses
will  then  be  used  with  a simple  pattern  recognition  program  (now  being 
written),  whereby  a single  analog  response  to  a test  stimulus  word  will 
be  compared  with  the  averaged  response  of  a standard  stimulus  word.  The 
comparison  word  response  will  be  normalized  and  scaled  on  the  Linc-8  scope 
for  a visual  "best  fit"  to  the  standard  word  response.  Point-by-point 
comparison  of  the  two  displays  will  then  be  made  using  a variance  analysis. 
If  the  sum  of  the  point  variances  is  below  a certain  value  (determined 
by the variance of the averaged response), then the comparison response
will  be  identified  as  belonging  to  the  same  word  as  the  averaged  response. 
The  reliability  of  this  method  can  then  be  checked  by  computing  an  error 
score  of  all  comparisons.  If  the  reliability  is  high,  this  procedure  will 
be tried on the CDC-6400 for pattern recognition of the analog waveform.
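
The comparison described above can be sketched in a few lines of Python.
This is an illustration under stated assumptions, not the program being
written for the Linc-8: normalization is taken to mean removing the mean and
scaling to unit peak, the acceptance threshold is taken to be a multiple
(`tolerance`, a hypothetical parameter) of the summed point variance of the
averaged response, and the waveforms are synthetic.

```python
import math


def average_response(trials):
    """Point-by-point mean and variance across repeated responses."""
    n = len(trials)
    length = len(trials[0])
    mean = [sum(t[i] for t in trials) / n for i in range(length)]
    var = [sum((t[i] - mean[i]) ** 2 for t in trials) / n for i in range(length)]
    return mean, var


def normalize(response):
    """Remove the mean and scale to unit peak amplitude."""
    m = sum(response) / len(response)
    centered = [v - m for v in response]
    peak = max(abs(v) for v in centered) or 1.0
    return [v / peak for v in centered]


def matches_template(response, trials, tolerance=2.0):
    """True if the summed squared deviation from the template is small.

    The threshold rule (tolerance times the summed point variance of the
    normalized standard trials) is an assumption made for illustration.
    """
    norm_trials = [normalize(t) for t in trials]
    template, var = average_response(norm_trials)
    threshold = tolerance * sum(var)
    test = normalize(response)
    deviation = sum((a - b) ** 2 for a, b in zip(test, template))
    return deviation <= threshold


def burst(center, jitter=0.0, n=50):
    """Synthetic integrated-EMG pattern: one smooth bump plus a small ripple."""
    return [math.exp(-((i - center) / 6.0) ** 2) + jitter * math.sin(i)
            for i in range(n)]


standard_trials = [burst(20, j) for j in (-0.05, -0.02, 0.0, 0.02, 0.05)]
same_word = burst(20, 0.03)    # same pattern, different ripple
other_word = burst(35, 0.03)   # bump at a different latency

same_ok = matches_template(same_word, standard_trials)
other_ok = matches_template(other_word, standard_trials)
```

An error score over many such comparisons would then be the fraction of
test responses assigned to the wrong word.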

CDC-6400. The major portion of data analysis will consist of
calculating the various statistics for the EMG and EEG data, including
cross-  and  auto-spectra,  linear  coherence,  and  weighted-average  coherence. 
These  statistics  will  then  be  used  in  the  clustering  procedure  for  pattern 
recognition.  Clustering  is  a means  of  grouping  data  so  that  similar  objects 
or  samples  fall  in  the  same  group  and  dissimilar  objects  or  samples  fall 
in  different  groups. 

ISODATA, the name of the clustering program, uses Euclidean distance
as the measure of dissimilarity. Thus, objects that are far apart
are assigned to separate clusters, and objects that are close together
are put in the same cluster. The values of the variables in a data set
must be scaled to the same relative order of magnitude so that they have
equal effect in the clustering. For example, the distance between the
points at coordinates (1000,5) and (1100,6) is approximately 100. No
significant contribution is added by the lower-scaled values 5 and 6.
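
Written out in full, using only the standard two-dimensional Euclidean
distance and the two points quoted in the example, the arithmetic is:

```latex
d = \sqrt{(1100 - 1000)^2 + (6 - 5)^2} = \sqrt{10001} \approx 100.005
```

The second coordinate therefore changes the distance by only about 0.005.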

The  data  were  processed  both  with  and  without  scaling.  In  the 
output of the EEG and EMG spectral analysis programs, it is observed that
the amplitudes are much higher for the lower frequencies, from 0-4 hertz.
Thus, clustering of the unscaled spectral data is based mainly on these
lower frequencies. In scaling the data, two different methods have been
used. First, amplitudes of spectral data are scaled to proportion, so
that each variable (i.e., each spectral bin) is normalized to a standard
deviation of "one" over all words. Second, each spectral bin is scaled
to a standard deviation of "one" over all pairs of recorded data channels
and  all  stimulus  words.  Use  of  this  technique  with  the  actual  data  is 
described  below. 
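
Scaling each spectral bin to a standard deviation of one can be sketched as
below. This is a minimal illustration, not the program used with ISODATA: it
assumes the population standard deviation (dividing by the number of
samples), and the data values and function names are hypothetical.

```python
import math


def column_sd(rows, k):
    """Population standard deviation of column k."""
    column = [r[k] for r in rows]
    mean = sum(column) / len(column)
    return math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))


def scale_to_unit_sd(rows):
    """Divide each column (spectral bin) by its standard deviation."""
    scaled = [list(r) for r in rows]
    for k in range(len(rows[0])):
        sd = column_sd(rows, k)
        if sd > 0:
            for i in range(len(rows)):
                scaled[i][k] = rows[i][k] / sd
    return scaled


# One row per word, one column per spectral bin. The first bin mimics a
# high-amplitude low-frequency bin, the second a low-amplitude bin.
spectra = [[1000.0, 5.0], [1100.0, 6.0], [900.0, 4.0]]
scaled = scale_to_unit_sd(spectra)

# Differences between the first two rows, bin by bin, after scaling.
diff_bin0 = scaled[1][0] - scaled[0][0]
diff_bin1 = scaled[1][1] - scaled[0][1]
```

Before scaling, the two rows differ by 100 in the first bin and by 1 in the
second; after scaling, the two differences are equal, so both bins carry
equal weight in a Euclidean distance.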

Subjects and Experiments


Subjects  were  three  adult,  right-handed,  human  female  volunteers, 
ages  21—11,  hereinafter  designated  B,  C,  and  D.  A total  of  16  experimental 
sessions, each of about 2½ hours duration, were carried out under two
experimental  paradigms.  A given  session  for  a given  subject  is  identified 
by  the  subject’s  letter  code  and  her  chronological  session;  thus  C5  was 
the  fifth  experimental  session  for  subject  C.  Before  conducting  these 
sessions,  several  apparatus  debugging  sessions  were  carried  out  with  a 
fourth  subject,  A. 

Experiment  1 was  concerned  with  establishing  optimal  values  for  all 
experimental  parameters.  Thus,  various  electrode  placements  for  both 
EMG and EEG were used to determine optimum placements for obtaining reproducible
patterns of surface-recordable muscle activity of the vocal apparatus
and the EEG related to overt speech. In addition, various words and
sentences suggested by our psychophysiological linguist consultant were
employed  to  determine  the  best  language  task.  For  Experiment  1,  subjects 
B and  C were  run  four  sessions  each  and  subject  D for  two  sessions. 


Experiment 2 was a further refinement of Experiment 1. Only those
electrode placements used in Experiment 1 that gave optimal results were
employed for further recording. Certain words from the language task
were selected that would most likely result in reproducible EMG and EEG
patterns.  Furthermore,  the  sessions  were  more  rigidly  structured,  each 
session  consisting  of  ten  repetitions  of  the  language  task.  For  Experiment  2, 
all  three  subjects  were  run  two  sessions  each. 


Electrodes  and  Electrode  Placements.  For  surface  recording  of  the 
EMG  from  facial  muscles  involved  in  speech  production,  Beckman  silver, 
silver-chloride  miniature  disk  skin  electrodes  (2-mm  exposed)  were  used. 

EEG  scalp  electrodes,  reference  electrodes,  and  the  ground  electrode 
were  Beckman  silver,  silver-chloride  standard  disk  skin  electrodes 
(8-mm  exposed).  Two  reference  sites  were  employed — the  skin  under  the 
left  mastoid  for  EMG  recordings  and  the  skin  under  the  right  mastoid  for 
EEG  recordings.  All  recordings  were  monopolar  in  order  to  record  absolute 
potentials at the recording site and to eliminate in-phase signals common
to  two  electrodes. 

Selected  skin  areas  were  first  cleansed  with  acetone  (alcohol  on  the 
face)  and  then  conditioned  with  Redox  electrode  paste  by  rubbing  it  into 
the  skin,  followed  by  a second  cleansing  of  acetone.  A conductive,  paste- 
filled  electrode  was  then  placed  over  each  recording  area  and  attached  by 
a sticky  collar  to  the  underlying  skin.  Following  a recording  session, 
electrodes  were  removed,  and  the  skin  was  cleaned  with  acetone  or  alcohol. 

Figure 1 shows the facial musculature. Muscles involved in vocalization
that are surface-recordable are 2, 3, 4, 6, 7, 8, 9, 10, 11, 13,
and 16. Each of these locations was tested during preliminary experiments,
and 2, 9, and combined sites 7/8 and 13/16 were selected for Experiment 1
as representative of the muscles most used in speech production of the
test words (see below). For Experiment 2, a single electrode was placed
over muscles 13/16 and another over muscles 7/8 based on the results of
Experiment 1 for best EMG speech-pattern production.

 1. Orbicularis oculi m.
 2. Quadratus labii superioris m. (right)
 3. Zygomatic head of quadratus labii superioris m. (right)
 4. Zygomaticus m. (right)
 5. Risorius m. (right, cut)
 6. Triangularis m. (right)
 7. Quadratus labii inferioris m. (right)
 8. Mentalis m.
 9. Quadratus labii inferioris m. (left)
10. Triangularis m. (left, cut)
11. Zygomaticus m. (left, cut)
12. Quadratus labii superioris m. (left, cut)
13. Orbicularis oris m.
14. Caninus m. (left)
15. Buccinator m. (left)
16. Depressor septi nasi m.
17. Nasalis m. (left)
18. Procerus m.
19. Frontalis m. (left)
20. Frontalis m. (right)
21. Orbicularis oculi m. (left)
22. Nasalis m. (right)

FIGURE 1  MUSCLES OF THE FACE (AFTER VAN RIPER AND IRWIN, 1958)

Figure 2 illustrates the 10-20 system of EEG recording (Penfield
and Jasper, 1954). Locations F7, T3, C5, F8, T4, C6, and T6 were employed
in Experiment 1. These approximate placements over cortical areas assumed
to be involved in speech production (Penfield and Roberts, 1959) as
follows. For the dominant hemisphere, F7 (Broca's speech area); C5, control
of vocalization musculature; and T3 and T5, speech organization and
comprehension, were used. For control, the nondominant hemisphere placements
F8, C6, T4, and T6 were employed. In Experiment 2, placements F7, T5,
and C5 were used for the dominant hemisphere and T6 for the control.

Equipment. Apparatus for Experiments 1 and 2 was the same; only
the electrode placements and procedures differed. Electrodes from the
facial musculature were led first to a Beckman Model 9852A EMG integrator
coupler, with a time constant of 0.25 seconds, and pass band of 20-5000
hertz. Experiment 1 used channels 1, 2, and 3 of the Dynograph to record
the integrated EMG. EEG electrodes were led to Beckman type 9806A
couplers, with a pass band of 2 to 30 cps; channels 4, 5, and 6 recorded
the instantaneous EEG. Channel 7 recorded the voice output of the microphone;
channel 8 was not used.

All  physiological  signals  were  preamplified  by  Beckman  Model  481B 
preamplifiers,  and  were  then  led  simultaneously  to  Beckman  Model  482A 
power  amplifiers  with  calibrated  zero  suppression,  and  to  an  Ampex  SP-300, 
seven  channel,  analog  instrument  tape  recorder.  The  output  of  the  Beckman 
power  amplifiers  drove  ink-writing  galvanometers  on  moving  chart  paper 
at  25  mm  per  sec.  The  output  on  the  chart  paper  could  be  set  by  a switch 
to record either the input to the Ampex tape recorder (i.e., "direct"
recording) or the output of the Ampex; this feature enables the investigator
to calibrate and monitor the permanent tape recording. EMG and EEG
recordings  were  on  channels  1 through  6 of  the  Ampex,  using  frequency 
modulation  at  1-7/8  inches  per  sec  (pass  band  dc-312  hertz).  Voice  was 
recorded on channel 7 using amplitude modulation (50 hertz to 3.5 kilohertz).

Data from Experiments 1 and 2 filled three 10½-inch Ampex tapes with
analog data. Some of these data were then sent through the data analysis
system (see above), using the Linc-8 laboratory instrument computer,
an XDS 930 computer, and finally a CDC-6400 computer (see Data Analysis
section  below). 

Procedure 


Language Task. Experiment 1 used 16 words and three sentences
(Table 1) as the language tasks (the words used in Experiment 2 are described
under that section). Individual words were chosen on the basis
of several criteria. According to our psycholinguistic consultant, the
main aspects of vowel production that have recordable consequences from
the EMG on the surface of the face area are: (1) lip rounding and (2) lip
opening. The lip opening action is most marked when there is a preceding
bilabial consonant (e.g., p, b, m). For consonant production,
the main aspects are: (1) lip closure, (2) lip spreading, and (3) lip rounding.
Therefore, for this experiment, we chose words that were most likely
to have these characteristics, and a few words that did not have them
for contrast.


                              Table 1

     PURE VOWELS, DIPHTHONGS, AND SENTENCES USED IN EXPERIMENT 1

    No.   Word      Vowel or Diphthong   Lips

    Pure Vowels
     1    heat      ee                   Spread
     2    hit       i                    Spread
     3    head      e                    Open
     4    had       ae                   Very open
     5    the       uh                   Spread
     6    bird      er                   Bilabial
     7    father    ah                   Spread
     8    call      aw                   Spread
     9    put       u                    Bilabial
    10    cool      oo                   Round
    11    ton       A                    Open

    Diphthongs
     1    tone      ou                   Round
     2    take      ei                   Spread
     3    might     ai                   Bilabial
     4    shout     au                   Round
     5    toil      oi                   Round

    Sentences

    1. "PIMPLES AND POCKMARKS MAR PULCHRITUDE IF PROMINENT ON
       PARTS WHICH ARE DISPLAYED TO THE PUBLIC."
       (16 bilabials, underlined; 15 words)

    2. "CRANK PHONE CALLS CAN IRRITATE ONE ALTHOUGH THEY ARE NOT
       AS DANGEROUS AS PHYSICAL ASSAULT."
       (3 bilabials, underlined; 13 words)

    3. "NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF
       THEIR COUNTRY."


The  first  11  words  in  Table  1 represent  the  pure  vowels,  which  means 
that  their  quality  is  unchanged  throughout  the  syllables  in  which  they 
are  employed.  In  addition,  these  vowels  represent  the  different  tongue 
positions  for  the  principal  English  vowels  (Denes  and  Pinson,  1963),  and 
therefore  give  a change  in  vocal  musculature  without  changing  the  balance 
of  their  quality  in  a syllable.  Thus,  the  EMG  recordings  should  reflect 
differences  based  only  on  tongue  position,  and  whether  the  lips  are 
spread, rounded, unrounded, or bilabials (see Table 1 for significance
of  each  word).  The  last  five  words  in  Table  1 are  diphthongs,  whose 
qualities  do  change  from  the  beginning  to  end  in  the  syllable  in  which 
they  are  used,  thus  offering  a contrast  for  the  EMG  measures  of  pure 
vowel  pronunciation.  Tongue  movements  of  the  diphthongs  are  between  those 
for pure vowels. Diphthongs also have bilabials, lips rounded, and
spreading  lips. 

Similarly,  the  sentences  were  chosen  to  reflect  (1)  the  predominant 
use of bilabials, (2) the predominant use of nonbilabials, and (3) a
control  sentence.  In  addition,  each  word  of  each  sentence  was  spoken 
separately,  as  well  as  the  sentence  being  spoken  naturally,  to  compare 
the  EMG  pattern  of  each  word  in  the  sentence  in  its  "pure"  state  with 
its EMG pattern when the word is preceded and followed by another word.

After electrodes were attached, Ss were comfortably seated in a semidark,
electrically shielded booth. All electrodes were plugged into a
junction  box  leading  to  the  Beckman  Dynograph  recorder.  Electrode 
resistances  were  checked;  if  one  was  found  to  be  greater  than  5000  ohms, 
it  was  removed,  the  skin  further  cleansed  and  conditioned,  and  the 
electrode  replaced.  When  all  electrodes  checked  correctly,  the  S was 
instructed in the experimental procedure, a microphone for recording
speech  was  placed  in  front  of  the  S's  mouth,  and  a recording  session 
was  begun. 

Stimulus Presentation. Each of the individual words and each sentence
in Table 1 was printed on a 35-mm slide (white on black to reduce
glare) and presented to the subject by projecting the word (or sentence)
on a rear projection screen about 3 feet from the subject's eyes. The
subtended visual angle of the stimulus and its intensity in the semidarkened
room were chosen to avoid squinting, glare, or eye strain. The
procedure  for  stimulus  presentation  was  as  follows. 

After installation in the recording chamber, the S was instructed
that she was to relax with eyes closed while the polygraph and tape
recorder gains and filters were adjusted for proper EMG and EEG recordings.
During that period, the S was to say her name when asked (to calibrate
EMG gains and the voice channel) and to open or close her eyes when asked
(to check for alpha in the closed-eyes EEG and alpha blocking, or
desynchronization, with eyes open). Following these adjustments, the S was
told she would be presented with a list of 16 words, one at a time, for
four full presentations, plus three sentences for two of the presentations.
The first presentation would be visual. (The S was shown a test word on
the screen as an example.) The S was to sit relaxed with her eyes closed.
On hearing the statement "ready" from the experimenter, she was to open
her eyes and look at the screen. In 2-3 sec, a stimulus word would be
projected on the screen for about 3 sec, during which time she was to
read the word aloud into the microphone. When the projected word was
turned off, she was to close her eyes and wait until the next "ready"
signal.

At the end of the 16 words, the three sentences were presented to
her twice each, one at a time. On the signal "ready," S was to open
her eyes and look at the screen. When the sentence appeared the first
time, she was to read it aloud one word at a time at the same speed as
the preceding individual words. On the second presentation of the
sentence, S was to read the sentence at her natural speed. The second
and third sentences were to be read in the same way. (All words and the
three sentences were presented in a random order to obviate any
anticipatory effects in the EMG and EEG.)

At the end of the first presentation, the S was instructed that the
same series would be shown as before, except that this time she was to
read the word (or sentence) silently. On the third and fourth presentations,
the S was instructed that only the 16 words would be presented, not the
sentences, and that this time the words would be called to her by the
experimenter (auditory presentation). On the third presentation she was
to repeat the entire list with her eyes remaining open, while on the
fourth presentation she would keep her eyes closed. This procedure was
used to discriminate between an auditory and a visual stimulus and
between eyes open (a desynchronized EEG) and eyes closed (a synchronized EEG).

Subjects B and C participated in four complete sessions each over
two months to test repeatability within a S and variation between Ss.
Subject D was given two complete sessions during the same period. Since
only seven channels of recording were available and 14 total electrode
placements were used, several trials were repeated for each S using those
electrodes not previously recorded from.

Results:  Experiment  1 


General


Figure 3 illustrates a typical Beckman Dynograph recording for the
period of visual presentation of a single stimulus word ("TOIL"), and the
overt response during the third session for subject C. Although such a
raw record does not provide much information, several items of interest
may be noted that are generally present for all stimuli and overt responses
in both Experiments 1 and 2.

First, note in channels 4, 5, and 6 that the EEG is synchronized at
about 9 hertz (alpha rhythm) one sec before the "ready" signal (shorter
bracketed area). However, following the ready signal, this synchronization
is significantly reduced. Shortly after the stimulus word is presented,
and before the beginning of the vocal response, all three EEG traces
become completely desynchronized (alpha "blocking"), and remain so until
the eyes are closed about ^ sec after vocalization is complete. This
means that under these conditions, the EEG spectra during speech will
consist primarily of "fast" frequencies (roughly 15-30 hertz).

Second, note the time relation between the physiological measures
and the complete vocal response (longer bracketed area). In channels
1, 2, and 3, the integrated EMG begins to increase about three-fifths
of a second before actual voice production (channel 7), and continues
for a short time after the vocalization ends. In the EEG, coincident
with the onset of the EMG increase, there is a large "slow-wave,"
negative/positive potential. Although not always present, it does occur
frequently. Because this negative/positive shift (largest in channel 4)
is so large, because it occurs over a second after the visual stimulus
comes on, and because it is not always present, it cannot be considered
a visually evoked response. This is clearly evident when comparing
Figure 3 with Figures 4, 5, and 6, which illustrate the same response
("TOIL") with audio presentation (eyes open and closed and overt response)
and during silent reading (covert response).

Other  comparisons  of  note  between  these  figures  are  the  EEG  during 
audio  presentation,  overt  response,  eyes  open  (Figure  4),  versus  eyes 
closed  (Figure  5).  In  Figure  4,  desynchronization  of  the  EEG  is  much 
the  same  as  in  Figure  3,  whereas  in  the  eyes  closed  condition  (Figure  5) 
desynchronization  is  only  slightly  decreased  in  channels  5 and  6 and  not 
at  all  in  channel  4.  Therefore,  the  spectra  between  eyes  closed  and  eyes 
open  should  be  different.  Nevertheless,  the  temporal  characteristics 
of  the  physiological  responses  relative  to  the  onset  of  vocalization 
discussed above are present in both conditions, including the slow-wave,
negative/positive potential in the EEG.

In Figure 6, during visual presentation but with a covert response,
there is relatively little change in the EMG (the gain of channels 1, 2,
and 3 is 100 times that in Figures 3, 4, and 5). In the EEG, however,
the slow-wave potential still appears (between brackets), although not
as sharp as during actual vocalization. Since there is little or no EMG
activity, even at high gain, this EEG potential cannot be caused by muscle
action. Also, since the temporal relation between this slow wave and the
onset of vocalization in Figures 3, 4, and 5 is consistent, the
probable onset of silent reading can be predicted from Figure 6. Of course,
the EEG of Figure 6 contains the same temporal relations of synchronization
and desynchronization relative to the ready signal, onset of the stimulus,
and the closing of eyes following the reading as was found in Figures 3,
4, and 5. This at least suggests that for the eyes-open condition, the
EEG for covert speech should have a spectrum very similar to that of the
EEG for overt speech.

FIGURE 4 DYNOGRAPH RECORDING OF RESPONSE TO AUDITORY STIMULUS, OVERT
RESPONSE, EYES OPEN. Electrodes and symbols the same as Figure 3.

FIGURE 5 DYNOGRAPH RECORDING OF RESPONSE TO AUDITORY STIMULUS, OVERT
RESPONSE, EYES CLOSED. Electrodes and symbols the same as Figure 3.

FIGURE 6 DYNOGRAPH RECORDING OF RESPONSE TO VISUAL STIMULUS, EYES OPEN,
COVERT RESPONSE (SILENT READING). Electrodes and symbols the same
as Figure 3.

Finally, in comparing the EMGs of Figures 3, 4, and 5 for the overt
response "TOIL," although marked differences in the patterns exist, there
are also obvious similarities. The differences may be a result of
differences in the actual pronunciation of the word, as is suggested by
the differences in pattern of the vocalization output shown in channel 7.

EMG  Analysis 

As described above, initial analysis of EMG records has been done
by visually comparing direct overlay tracings of the Dynograph outputs.
Figure 7 illustrates the comparison of three EMG electrode outputs for
Ss B and C for the same overtly spoken word ("TAKE") on two different
experimental days each. Thus, comparisons can be made within a S on
two different occasions, between two different Ss for the same word,
and between three sets of muscles for the same S and word. Figure 8
illustrates the comparison between two different but similar sounding
words ("TONE" and "TON") for the same S during the same recording session.


A comparison of the EMG records for subject C in Figures 7 and 8
shows that the EMG patterns are more consistent from one session to
another for the same word ("TAKE") than between two similar words
("TONE" and "TON") recorded in the same session. Nevertheless, the
EMG patterns for "TAKE" (Figure 7) are not as consistent for subject C
on two different occasions as might be expected a priori. This may be
due to differences in electrode placement between the two sessions, since
in Experiment 2 and with subject B in Experiment 1 (Figure 7), a much
closer pattern repeatability is shown, perhaps because of better electrode
placement. Note also in Figure 7 that while subject B's EMG responses
to "TAKE" on two different occasions are fairly similar, they are quite
different from subject C's, verifying that the physiological response
patterns are unique for each individual.

Figure 9 compares the EMG patterns for subjects C and B between
sessions for sentence 3. The results are essentially the same as for
the individual words: namely, that each individual's response is unique,
and is more similar from one session to another within a S than between
Ss within sentences or between sentences within a S and within a session.

By thus comparing the EMG responses to all words for the three Ss for all
conditions and over all sessions, it was found that EMG patterns for the
muscle groups measured by electrodes 7/8 and 13/16 most consistently
reproduced a pattern for the same word.

Finally, Figure 10 (bottom) shows the reconstruction of the EMG
response of sentence 3 from the separate responses to the individual
words. Comparison of the reconstruction with the naturally spoken
sentence (upper portion of Figure 10) illustrates that the pattern of each
word spoken separately is recognizable in the naturally spoken sentence,
even though in the latter each word is preceded and followed by another
word. This suggests that individual words may be recognized within a
sentence by the computer pattern recognition procedure.

FIGURE 9 EMG PATTERNS FOR TWO SUBJECTS ON TWO SEPARATE SESSIONS FOR A SENTENCE

FIGURE 10 RECONSTRUCTION (BOTTOM) OF A SENTENCE FROM EMG RESPONSES TO INDIVIDUAL WORDS
COMPARED WITH THE EMG RESPONSES FOR THE NATURALLY SPOKEN SENTENCE (UPPER)

EEG Analysis

The first portion of data collection and analysis is to edit and
digitize the data recorded on the analog tape by the Linc-8 computer;
however, programs for doing this have only recently been completed
(see section Methods, General, above). Therefore, to begin analysis on
the CDC-6400 as soon as possible, a time-consuming collateral path was
used to digitize the tapes of Experiment 1 (all Experiment 2 tapes are
now being edited on the Linc-8). This involved playback of the analog
tape on a different model tape recorder (Ampex FR 1300) than was used
on initial recording, writing programs for digitizing an entire tape on
an SDS 930 computer, and finally writing a separate program on the CDC-6400
computer for locating the data for particular words. In addition, a
statistical analysis program supplied by Dr. Gary Galbraith of the
University of Southern California had to be debugged and calibrated on
the CDC-6400 to analyze the data of the selected words. Five data words
were selected for initial analysis and the various statistics for the
EEG and EMG were computed. These compiled data were then analyzed for
pattern content in the clustering program. The sampling rate for the
digitization was set at 2500 samples/sec/channel (seven channels), and
the analog tape was played back at 30 ips, providing an effective sampling
rate per channel of 156 samples per sec. This allows a usable frequency
range of 0-78 hertz (half the effective sampling rate). Since the EEG
signals below about 2 hertz are not significant for this study, the low
end of the spectrum may be ignored for spectral analysis and clustering
of the EEG data.
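
The sampling figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes the analog tape was originally recorded at 1 7/8 ips, a speed that is not stated in this report but is the one that reconciles the quoted numbers (playback at 30 ips would then be a 16-fold speed-up). The variable names are hypothetical.

```python
from fractions import Fraction

digitizer_rate = 2500           # samples/sec/channel during playback (from the text)
playback_ips = Fraction(30)     # playback tape speed in inches/sec (from the text)
record_ips = Fraction(15, 8)    # ASSUMED recording speed of 1 7/8 ips (not in the report)

# Playing the tape back faster than it was recorded compresses time,
# so each second of playback covers `speedup` seconds of the experiment.
speedup = playback_ips / record_ips

# Samples taken per second of the original (real-time) signal.
effective_rate = digitizer_rate / speedup

# Highest frequency representable at that rate (half the sampling rate).
nyquist_hz = effective_rate / 2

print(float(speedup), float(effective_rate), float(nyquist_hz))
```

Under this assumption the effective rate comes out slightly above the rounded 156 samples per sec quoted in the text, and the upper frequency limit slightly above 78 hertz.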

Following several digitization sessions, data for the five selected
words (Table 2) were prepared for the clustering analysis. Data selection
was based on the appearance of the raw Dynograph record for reproducibility,
so that the number of occurrences shown in Table 2 were included in the
clustering analysis.

Table 2

WORDS SELECTED FOR CLUSTERING ANALYSIS,
NUMBER OF OCCURRENCES, AND SEQUENCE OF OCCURRENCE

Word     Number of Occurrences     Sequence of Occurrence

HIT      6                         1, 6, 9, 11, 16, 20
COOL     5                         2, 7, 12, 17, 21
PUT      4                         3, 10, 13, 18
HAD      4                         4, 8, 14, 19
HEAD     2                         5, 15

Total    21

Before calculating the various statistics to be used in the
clustering program, several waveform plots of the digitized data were obtained
on a CDC-280 microfilm plotter connected to the CDC-6400. This was
necessary to show that the digitized data conform to the raw Dynograph
record. Figure 11 illustrates such a plot of all seven channels for the
word "HIT" in session 1 for S C, the ordinate being arbitrary amplitude
and the abscissa the time for 256 samples.

The physiological data of the selected words were then fed into
the spectral analysis program to obtain frequency components, cross
spectra, linear coherence, and the weighted average coherence functions.
These statistics each may contain features for recognition of the
physiological components of speech, and so were input to the clustering program
in various combinations. In theory, if the information contained
in the spectral output has "recognizable features," then the clustering
program is capable of classifying the biological potentials according
to the selected words. One of the main problems with this analysis,
as with any cluster analysis on exploratory data, is the relative scaling
of the 33 frequency bin amplitudes in the spectral output. Even if the
signals have "features," they may be lost if the choice of a scale is
incorrect. For this reason, the data were normalized (as explained in
section Methods, General), this being the least-biased relative scaling
method.
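
The spectral statistics named above can be illustrated with a small, self-contained sketch. It is not the analysis program used in this study: it assumes segment averaging over non-overlapping 64-sample segments (chosen here only because 64 samples yield 33 one-sided frequency bins, matching the bin count in the text), takes "linear coherence" to mean the usual magnitude-squared coherence, and runs on synthetic data. The weighted average coherence is not shown because its definition is not given in this section. All function names are hypothetical.

```python
import cmath
import math
import random

def dft(x):
    """One-sided discrete Fourier transform (bins 0..N/2) of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def averaged_spectra(x, y, seg_len=64):
    """Auto- and cross-spectra averaged over non-overlapping segments."""
    n_bins = seg_len // 2 + 1
    n_seg = len(x) // seg_len
    pxx, pyy, pxy = [0.0] * n_bins, [0.0] * n_bins, [0j] * n_bins
    for s in range(n_seg):
        fx = dft(x[s * seg_len:(s + 1) * seg_len])
        fy = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_bins):
            pxx[k] += abs(fx[k]) ** 2 / n_seg
            pyy[k] += abs(fy[k]) ** 2 / n_seg
            pxy[k] += fx[k].conjugate() * fy[k] / n_seg
    return pxx, pyy, pxy

def linear_coherence(pxx, pyy, pxy):
    """Magnitude-squared coherence per bin (0.0 where a spectrum is empty)."""
    return [abs(c) ** 2 / (a * b) if a > 0 and b > 0 else 0.0
            for a, b, c in zip(pxx, pyy, pxy)]

# Synthetic stand-ins for two channels: a shared 10 Hz rhythm plus independent noise.
rng = random.Random(0)
fs = 156.25                       # assumed effective sampling rate, samples/sec
shared = [math.sin(2 * math.pi * 10 * t / fs) for t in range(256)]
chan_a = [s + 0.5 * rng.gauss(0, 1) for s in shared]
chan_b = [s + 0.5 * rng.gauss(0, 1) for s in shared]

pxx, pyy, pxy = averaged_spectra(chan_a, chan_b)
coh = linear_coherence(pxx, pyy, pxy)
print(len(coh))                   # number of one-sided frequency bins
```

Averaging over several segments is what makes the coherence informative; computed from a single segment it is identically 1 in every bin.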

The initial clustering was carried out on the 21 word occurrences
based on the spectral content for frequency and amplitude of the EEG plus
EMG, nonscaled; for the frequency and amplitude of the EEG alone, nonscaled;
and for the frequency and amplitude of the EMG alone, nonscaled. This
was then repeated with the data scaled in two different ways as previously
described. An example of the results (for the EMG-only condition) is
shown in Table 3. Eight separate clusterings were found by first placing
all points in one cluster, then partitioning this into two clusters, and
so on until eight clusters were obtained. These eight were then "lumped"
one cluster at a time until only two clusters remained. Note in this
clustering that all the words beginning with an "h" are grouped in the
first cluster at the two-cluster level.

Table 3

21 WORDS FROM SESSION C1, EMG CHANNELS ONLY (CHANS 1-3)
Bins 3-33 of Frequency Data Not Scaled

       Number of  Points in
Iter   Clusters   Cluster    % Error   Words in the Cluster and Frequency of Occurrence

 1     1          21         100       All points in one cluster

 2     2          1-15       56.5532   HIT (6), HAD (4), HEAD (2), COOL (3)
                  2-6                  COOL (2), PUT (4)

 3     4          1-8        32.1759   HIT (2), HAD (3), HEAD (1), COOL (2)
                  2-4                  COOL (2), PUT (2)
                  3-2                  PUT (2)
                  4-7                  HIT (4), HAD (1), HEAD (1), COOL (1)

 4     8          1-4        13.9114   HIT (1), HEAD (1), COOL (1), HAD (1)
                  2-3                  COOL (1), PUT (2)
                  3-1                  PUT (1)
                  4-4                  HAD (1), HEAD (1), HIT (1), COOL (1)
                  5-1                  PUT (1)
                  6-1                  COOL (1)
                  7-3                  HIT (3)
                  8-4                  HAD (2), HIT (1), COOL (1)

 5     7          1-8        15.2311   HIT (2), HAD (3), HEAD (1), COOL (2)
                  2-3                  COOL (1), PUT (2)
                  3-1                  PUT (1)
                  4-4                  HAD (1), HEAD (1), HIT (1), COOL (1)
                  5-1                  PUT (1)
                  6-1                  COOL (1)
                  7-3                  HIT (3)

 6     6          1-8        17.3050   HIT (2), HAD (3), HEAD (1), COOL (2)
                  2-3                  COOL (1), PUT (2)
                  3-1                  PUT (1)
                  4-7                  HIT (4), HAD (1), HEAD (1), COOL (1)
                  5-1                  PUT (1)
                  6-1                  COOL (1)

 7     5          1-15       23.7602   HIT (6), HAD (4), HEAD (2), COOL (3)
                  2-3                  COOL (1), PUT (2)
                  3-1                  PUT (1)
                  4-1                  PUT (1)
                  5-1                  COOL (1)

 8     4          1-15       29.0901   HIT (6), HAD (4), HEAD (2), COOL (3)
                  2-4                  COOL (2), PUT (2)
                  3-1                  PUT (1)
                  4-1                  PUT (1)

 9     3          1-19       57.9038   HIT (6), COOL (5), HAD (4), HEAD (2), PUT (2)
                  2-1                  PUT (1)
                  3-1                  PUT (1)

10     2          1-20       72.7829   HIT (6), COOL (5), HAD (4), HEAD (2), PUT (3)
                  2-1                  PUT (1)

FIGURE 12 CHARACTERISTIC CLUSTER CURVE FOR THE EIGHT
CLUSTERS FOR SUBJECT C1 (From Table 3)

A measure of how well the partitioning into clusters describes the
data is the distance a given point in a cluster is from the cluster center.
This distance can be considered as an "error" score for each data point
in the cluster. As an overall statistic for clusteredness, we use the
sum of all of the squared errors within the clusters, measured as a
percentage of the total squared error when all points are placed in one
cluster (by definition, 100%). The results on the five selected words are
indicated in the fourth column of Table 3, and are also plotted versus
the number of clusters in Figure 12 as the cluster characteristic curve
for these words. Each point on this curve shows the error or variance
caused by using only a certain small number of clusters, rather than using
one cluster for each word or data object. Note that this latter condition
would represent zero error, since each data object would be its own
cluster. We arbitrarily define the one-cluster (the overall average of
the data) error to have a value of 100%. The cluster characteristic curve
shows how much the error drops as more clusters are used to describe
the data.
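
The percent-error statistic described above can be sketched in a few lines. This is a minimal illustration under one reading of the text, namely that the squared distances of points from their own cluster centers are summed over all clusters and expressed as a percentage of the squared error obtained with a single cluster. The function names and the sample points are hypothetical and are not data from this study.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def squared_error(points):
    """Sum of squared distances from each point to the centroid of the set."""
    c = centroid(points)
    return sum(sum((p[d] - c[d]) ** 2 for d in range(len(c))) for p in points)

def percent_error(clusters):
    """Within-cluster squared error as a percentage of the one-cluster error."""
    all_points = [p for cluster in clusters for p in cluster]
    total = squared_error(all_points)                    # one-cluster error (100%)
    within = sum(squared_error(cluster) for cluster in clusters)
    return 100.0 * within / total

# Hypothetical two-dimensional feature points forming two well-separated groups.
group_a = [(0.0, 0.0), (0.0, 2.0)]
group_b = [(10.0, 0.0), (10.0, 2.0)]

one_cluster = percent_error([group_a + group_b])
two_clusters = percent_error([group_a, group_b])
singletons = percent_error([[p] for p in group_a + group_b])
print(one_cluster, two_clusters, singletons)
```

Plotting this value against the number of clusters gives a curve of the kind shown in Figure 12: it starts at 100% for one cluster and falls to zero when every data object is its own cluster.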

Thus, Table 3 and Figure 12 suggest that the five selected words
may be grouped and compared with a theoretically perfect clustering.
That is, are there five clusters, each containing all incidences of a
particular word? In this manner, it ideally might be found that cluster
1 of the five clusters contains all the occurrences for the word HIT,
cluster 2 for the word COOL, cluster 3 for the word PUT, cluster 4 for
the word HAD, and cluster 5 for the word HEAD. With additional analysis
from Experiment 2, with its more rigidly structured data collection, we
expect the quality of the data to improve, and therefore the clustering
to be more selective.

Results:  Experiment  2 


General 


Experiment 2 was designed to refine Experiment 1 to obtain more
accurate data that might qualify for the clustering pattern recognition
program. First, it was decided to choose fewer words than previously
employed, and to select words that had the greatest likelihood of
reproducing consistent EMG patterns. Second, words were repeated ten times
during each of two recording sessions for each of the three Ss, all
under the visual presentation condition of Experiment 1. Third, electrode
sites were restricted to the six that could be recorded simultaneously,
including two EMG and four EEG (see Methods, Apparatus, Electrode
Placements). Fourth, five additional bisyllabic, phonetically balanced words
were added to the language task, with the accent first on one syllable
and then on the other. The words and the sentence that was overtly read
at the end of each word list are shown in Table 4. The 15 words were
chosen to emphasize rounded lips, bilabials, and open lips in the case
of the monosyllables, and to assess the contribution of the lead part
of a bisyllabic word on the second part (and vice versa) when one
syllable is accented. Finally, no covert responses were obtained in
Experiment 2.

Figure 13 illustrates the raw record of the word "COOL" by Subject C
on session 6. This record is much like that of Figure 3, except that
now there are two integrated EMG channels and four EEG channels. The
stimulus response paradigm is also the same as in Figure 3: namely, that
the S is resting with eyes closed before a visual word presentation.
On a "ready" signal, she opens her eyes and attends to the screen. On
stimulus presentation, she overtly reads the word into the microphone
and then closes her eyes again to await the next word.

Note in Figure 13 that the same general results were found as
described above for Experiment 1. That is, the EEG is synchronized
until after the eyes are opened following the ready signal, and then
remains desynchronized until the eyes are closed. Second, the EMG
begins to increase about 2/5-3/5 sec before the actual vocalization.
Table 4

LANGUAGE TASK FOR EXPERIMENT 2

                                    Bisyllable
Monosyllable     Accent First Syllable     Accent Second Syllable

TIP              BLACKBOARD                BLACKBOARD
HIT              SCHOOLBOY                 SCHOOLBOY
HAD              COUGHDROP                 COUGHDROP
PUT              SHIPWRECK                 SHIPWRECK
COOL             MOUSETRAP                 MOUSETRAP

Sentence: THE SHIPWRECKED SCHOOLBOY HAD PUT A COOL
COUGHDROP IN THE MOUSETRAP AND AIMED IT TO
HIT AND TIP OVER THE BLACKBOARD.

FIGURE 13 DYNOGRAPH RECORDING OF STIMULUS-RESPONSE OF EMG AND EEG
FOR THE WORD "COOL"
Third, in all EEG channels, but especially in the dominant (speech)
hemisphere (in particular, over Broca's area, electrode F7), the
slow-wave, negative/positive potential is present with the onset of the
EMG increase. Interestingly, for this subject, this slow wave appears
again in the EEG following the end of the EMG changes and just before
the eyes are closed. The significance of this result is unknown at
this time, but if this potential remains consistent for covert responses
as well (as it did in Experiment 1), it may serve as a feature detector
in the cluster analysis program.

EEG cluster analysis of these data is awaiting tape editing and
digitization on the Linc-8, which should be completed by December.

EMG  Analysis 

Figure 14 shows the results of multiple (N=5) EMG tracings of the
15 stimulus words for the fifth session of Subject C. The vertical
line in each set of tracings marks the beginning of vocalization. These
multiple tracings show that the EMG variability for a given muscle
group and for a given word is significantly less than that between words
and between muscle groups. They also show that the temporal variability
is slightly larger than the magnitude variability. Some words, like
PUT, MOUSETRAP, and BLACKBOARD, show much less variability than others.

In any event, both the temporal and amplitude variability within
a word and within a muscle group are sufficiently low that an average
response may confidently be drawn to represent the EMG response to a
given word. Such average curves from Figure 14 are drawn in Figure 15
for C5. The averages of bisyllabic words for C5 may be compared with the
bisyllabic words for B5 and B6 (Figures 16 and 17) for between-S
variability, and between B5 and B6 for within-S variability. Such a comparison
reveals again that the EMG response for a given S is unique; however,
the responses between subjects C and B for the words "COUGHDROP,"
"BLACKBOARD," "blackboard," and "SHIPWRECK" are fairly similar. Comparisons
within a subject between sessions (Figures 16 and 17) show relatively
small variability, while comparisons between words accented on the first
syllable and those accented on the second show less variability than
might be supposed a priori.
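
The averaging of multiple tracings can be sketched as follows. In this study the average curves were drawn from overlay tracings; the code below is only a hypothetical digital equivalent, assuming each tracing is a list of integrated-EMG samples and that the sample index of the vocalization marker (the vertical line in Figure 14) is known for each one. The function name and the numbers are illustrative.

```python
def average_response(tracings, onsets, pre, post):
    """Average several tracings after aligning each on its vocalization onset.

    tracings: lists of samples taken at the same rate
    onsets:   index of the vocalization marker in each tracing
    pre/post: number of samples kept before and from the onset onward
    """
    aligned = []
    for trace, onset in zip(tracings, onsets):
        if onset - pre < 0 or onset + post > len(trace):
            raise ValueError("tracing too short for the requested window")
        aligned.append(trace[onset - pre:onset + post])
    # Average sample by sample across the aligned windows.
    return [sum(samples) / len(aligned) for samples in zip(*aligned)]

# Hypothetical integrated-EMG tracings (arbitrary units), N = 3 for brevity.
tracings = [
    [0, 0, 1, 4, 1, 0],
    [0, 1, 2, 1, 0, 0],
    [0, 0, 0, 1, 3, 1],
]
onsets = [3, 2, 4]   # sample index of the vocalization marker in each tracing
template = average_response(tracings, onsets, pre=1, post=2)
print(template)
```

Aligning on the vocalization marker before averaging matters because the text reports temporal variability between repetitions; averaging unaligned tracings would smear the pattern.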

Finally, Figure 18 illustrates multiple tracings for Subject B, on
her fifth session, for the sentence. Note that even here, where each
sentence may be spoken at a slightly different rate, the magnitude
variability is relatively small, and even the temporal variability is
low during the first portion of the sentences. Furthermore, a comparison
of the sentence components of Figure 18 with the overt response of the
individual words of Figure 16 shows that roughly 80% of the EMG patterns
within the sentence can be picked out visually by knowledge of the average
response of the single word.
FIGURE 14 MULTIPLE (N = 5) TRACINGS OF THE EMG PATTERNS FOR 15 STIMULUS WORDS, SESSION 5,
SUBJECT C. Vertical lines indicate beginning of vocalization.

FIGURE 15 AVERAGE OF EMG RESPONSES OF MULTIPLE TRACINGS OF FIGURE 14

FIGURE 16 AVERAGE EMG RESPONSES FOR BISYLLABIC WORDS FOR SUBJECT B, SESSION 5, COMPARING
ACCENTS ON THE SECOND SYLLABLE WITH ACCENTS ON THE FIRST SYLLABLE

FIGURE 17 SAME AS FIGURE 16 FOR SUBJECT B, SESSION 6. Compare with Figure 16 for within-subject variability


Discussion and Conclusions


The objectives of the first year of this project were to establish
the validity of the basic premise that patterns of biological information
can be related to covert language behavior (thought). We might restate
this by asking several specific questions:

(1) Are integrated EMG patterns of spoken words consistent within
words and over time within a subject? If so, are they unique
for a given subject for a given word? If so, do they follow
linguistic laws?

(2) What statistic of the EEG corresponds sufficiently with the
EMG pattern of speech to use as a feature detector in a pattern
recognition program for a given word?

(3) Given this EEG statistic, can it distinguish one spoken word
from another? Can it identify a given covert word? Can it
select a word from an overt or covert sentence containing
the word, even when different words precede and follow the
test word?

The results of the first six months of the research reported above
for Experiments 1 and 2, although not conclusive, support an affirmative
response for the first set of questions. The significant results may be
summarized as follows:

(1) The EMG patterns for each word used thus far in this research
are specific for that word.

(2) The EMG patterns for a given word are consistent, showing less
within-subject variability than between-subject variability.

(3) Amplitude variability of an EMG pattern for a given word is
moderate, but slightly more than the temporal variability.
However, both types of variability are sufficiently small
so that an average pattern can be obtained that reliably
represents the word.

(4) The average EMG response of a given word for a given subject
can be used as a template to identify the same EMG response
for that word when the word is imbedded in a sentence.

(5) The variability of bisyllabic words between those accented on
the first syllable and those accented on the second is greater
than the within variability for a given bisyllabic word, and
greater than the variability for monosyllabic words. However,
this accent variability is still sufficiently small so that
either accented word may be used to identify the same unaccented
bisyllabic word when it is imbedded in a sentence.
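
The use of an average response as a template, as in item (4), can be sketched with a sliding normalized correlation. This is a hypothetical automation: the identification reported above was done visually, and nothing here is the pattern recognition program of this project. The function names and sample values are illustrative, and the correlation measure is an assumed choice that ignores overall gain differences between recordings.

```python
import math

def normalized_correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [v - mean_a for v in a]
    dev_b = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in dev_a) * sum(v * v for v in dev_b))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom

def best_match(template, sentence):
    """Slide the template along the sentence; return (start index, score) of the best fit."""
    m = len(template)
    scores = [normalized_correlation(template, sentence[i:i + m])
              for i in range(len(sentence) - m + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical average EMG response for one word (arbitrary units).
word_template = [0, 2, 5, 2, 0]
# Hypothetical sentence record containing that pattern at twice the amplitude.
sentence_emg = [1, 1, 3, 1, 1, 0, 4, 10, 4, 0, 2, 2, 2]

start, score = best_match(word_template, sentence_emg)
print(start, round(score, 3))
```

Because the score is normalized, the word is found even though its amplitude in the sentence differs from the template; differences in speaking rate would need a further time-alignment step that is not shown.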
Answers to the set of questions on the EEG are still not complete.
We are assuming for now that if the EMG patterns are consistent for the
overt response, as they are, then the EEG for the covert response should
be similar to that of the overt response to the same word. Evidence
from the raw record EEGs, both with respect to frequency differences of
synchronized versus desynchronized patterns and the existence of the
slow-wave, negative/positive potential for the covert and overt responses,
suggests that such a correlation is quite possible. In addition, the
pattern recognition program so far has been able to select those words
beginning with an "h" from other words in Experiment 1, using clustering
of several EEG statistics. These statistics are the cross- and
auto-spectra, the linear coherence function, and the weighted-average coherence.
The one that will serve best as a feature detector is yet to be determined.

During  the  next  six  months,  with  the  improved  Linc-8  tape-editing  and 
digitizing  of  the  data  of  Experiment  2,  we  expect  to  answer  the  questions 
involving  the  EEG.  By  December  1,  our  turnover  time  for  data  analysis 
should be reduced to one day. Based on our results with the EMG, we are
fairly  confident  that  we  will  be  able  to  establish  definitively  whether 
the  EEG  alone  can  provide  us  with  sufficient  information  to  identify 
language  behavior. 

Lawrence  R.  Pinneo 

Manager,  Neurophysiology  Program 

David  J.  Hall 

Senior  Research  Engineer 